Stochastic Complexity Based Estimation of Missing Elements in Questionnaire Data
Abstract
In this paper we study a new, information-theoretically justified approach to missing data estimation for multivariate categorical data. The approach is a model-based imputation procedure relative to a model class (i.e., a functional form for the probability distribution of the complete data matrix), which in our case is the set of multinomial models with some independence assumptions. Given the model class assumption, an information-theoretic criterion can be derived for selecting between different complete data matrices. Intuitively, this general criterion, called stochastic complexity, represents the shortest code length needed for coding the complete data matrix relative to the chosen model class. Using this information-theoretic criterion, the missing data problem is reduced to a search problem, i.e., finding the data completion with minimal stochastic complexity. In the experimental part of the paper we present empirical results of the approach on two real data sets, and compare these results to those achieved by commonly used techniques such as case deletion and imputation of sample averages.
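The search formulation described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the paper's actual method: the columns are treated as fully independent multinomials, stochastic complexity is approximated by the negative log marginal likelihood under a uniform Dirichlet prior, and the search is a brute-force enumeration that is only feasible for a handful of missing cells. The function names (`column_code_length`, `stochastic_complexity`, `complete_by_search`) are hypothetical.

```python
import math
from itertools import product


def column_code_length(values, n_categories):
    """Code length (in nats) of one categorical column.

    Assumption: multinomial model with a uniform Dirichlet prior, so the
    code length is the negative log marginal likelihood of the column.
    """
    counts = [0] * n_categories
    for v in values:
        counts[v] += 1
    n = len(values)
    log_ml = (
        math.lgamma(n_categories)
        - math.lgamma(n + n_categories)
        + sum(math.lgamma(c + 1) for c in counts)
    )
    return -log_ml


def stochastic_complexity(matrix, n_categories):
    """Total code length of a complete matrix, assuming independent columns."""
    return sum(
        column_code_length([row[j] for row in matrix], k)
        for j, k in enumerate(n_categories)
    )


def complete_by_search(matrix, n_categories):
    """Return the completion of `matrix` (None = missing) with minimal code length.

    Exhaustive search over all fill-ins; illustrative only.
    """
    missing = [
        (i, j)
        for i, row in enumerate(matrix)
        for j, v in enumerate(row)
        if v is None
    ]
    best, best_len = None, math.inf
    for fill in product(*(range(n_categories[j]) for _, j in missing)):
        candidate = [list(row) for row in matrix]
        for (i, j), v in zip(missing, fill):
            candidate[i][j] = v
        length = stochastic_complexity(candidate, n_categories)
        if length < best_len:
            best, best_len = candidate, length
    return best, best_len
```

Under these simplifying assumptions each missing cell ends up filled with the value that makes its column most compressible; a model class with dependencies between variables, as the abstract suggests, would let the other answers in a row influence the imputed value, and would typically need a heuristic search rather than enumeration.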
Similar resources
Capacity Drop Estimation Based on Stochastic Approach Applied to Tehran-Karaj Freeway
The existence of the capacity drop phenomenon, defined as the difference between pre-queue and queue discharge flow rates, has been one of the controversial concepts in traffic engineering. Several studies have focused on the existence of capacity drop and on its estimation. This paper aims to estimate capacity drop based not only on a comparison between breakdown and queue discharge flow rates, but also ...
On Calibration and Application of Logit-Based Stochastic Traffic Assignment Models
There is a growing recognition that discrete choice models are capable of providing a more realistic picture of route choice behavior. In particular, influential factors other than travel time that are found to affect the choice of route trigger the application of random utility models in the route choice literature. This paper focuses on path-based, logit-type stochastic route choice models, i...
Joint Bayesian Stochastic Inversion of Well Logs and Seismic Data for Volumetric Uncertainty Analysis
Herein, an application of a new seismic inversion algorithm in one of Iran’s oilfields is described. Stochastic (geostatistical) seismic inversion, as a complementary method to deterministic inversion, is regarded as a combination of geostatistics and seismic inversion algorithms. This method integrates information from different data sources with different scales, as prior informat...
Performance evaluation of different estimation methods for missing rainfall data
There are numerous methods for estimating missing values, and the choice among them depends on the data type and regional climatic characteristics. In this research, part of the monthly precipitation data from the Sarab synoptic station, East Azerbaijan province, Iran, was randomly treated as missing. In order to study the effectiveness of various methods to estimate missing data, by seven classic s...
Role of Kaplan’s Preference Matrix in the Assessment of Building façade, Case of Gorgan, Iran
Buildings play a key role in the organization and arrangement of a city's appearance. In particular, their facades have a profound impact on the quality of urban landscapes and play an important role in how citizens assess urban environments. The introduction of superior building facades in terms of popular preferences is mostly based on visual elements of building facades. Furthermore, aesthetic pref...